Field of the Invention
The present invention describes nucleic acid sequences for eukaryotic genes encoding
lycopene ε-cyclase (also known as ε-cyclase and ε lycopene cyclase), isopentenyl
pyrophosphate (IPP) isomerase and β-carotene hydroxylase, as well as vectors containing the
same and hosts transformed with said vectors. The present invention also provides methods
for augmenting the accumulation of carotenoids, changing the composition of the
carotenoids, and producing novel and rare carotenoids. The present invention provides
methods for controlling the ratio or relative amounts of various carotenoids in a host. The
invention also relates to modified lycopene ε-cyclase, IPP isomerase and β-carotene
hydroxylase. Additionally, the present invention provides a method for screening for genes
and cDNAs encoding enzymes of carotenoid biosynthesis and metabolism.

Background of the Invention
Carotenoid pigments with cyclic endgroups are essential components of the
photosynthetic apparatus in oxygenic photosynthetic organisms (e.g., cyanobacteria, algae
and plants; Goodwin, 1980). The symmetrical bicyclic yellow carotenoid pigment β-carotene
(or, in rare cases, the asymmetrical bicyclic α-carotene) is intimately associated with
the photosynthetic reaction centers and plays a vital role in protecting against potentially
lethal photooxidative damage (Koyama, 1991). β-Carotene and other carotenoids derived
from it or from α-carotene also serve as light-harvesting pigments (Siefermann-Harms,
1987), are involved in the thermal dissipation of excess light energy captured by the
light-harvesting antenna (Demmig-Adams & Adams, 1992), provide substrate for the biosynthesis
of the plant growth regulator abscisic acid (Rock & Zeevaart, 1991; Parry & Horgan, 1991),
and are precursors of vitamin A in human and animal diets (Krinsky, 1987). Plants also
exploit carotenoids as coloring agents in flowers and fruits to attract pollinators and agents of
seed dispersal (Goodwin, 1980). The color provided by carotenoids is also of agronomic
value in a number of important crops. Carotenoids are currently harvested from a variety of
organisms, including plants, algae, yeasts, cyanobacteria and bacteria, for use as pigments in
food and feed.
The probable pathway for formation of cyclic carotenoids in plants, algae and
cyanobacteria is illustrated in Figure 1. Two types of cyclic endgroups or rings are
commonly found in higher plant carotenoids; these are referred to as the β (beta) and ε
(epsilon) rings (Fig. 3). The precursor acyclic endgroup (no ring structure) is referred to as
the ψ (psi) endgroup. The β and ε endgroups differ only in the position of the double bond in
the ring. Carotenoids with two β rings are ubiquitous, and those with one β and one ε ring
are common, but carotenoids with two ε rings are uncommon. β-Carotene (Fig. 1) has two
β-endgroups and is a symmetrical compound that is the precursor of a number of other
important plant carotenoids such as zeaxanthin and violaxanthin (Fig. 2).

Genes encoding enzymes of carotenoid biosynthesis have previously been isolated 
from a variety of sources including bacteria (Armstrong et al., 1989, Mol. Gen. Genet. 216, 
254-268; Misawa et al., 1990, J. Bacteriol., 172, 6704-12), fungi (Schmidhauser et al., 1990, 
Mol. Cell. Biol. 10, 5064-70), cyanobacteria (Chamovitz et al., 1990, Z. Naturforsch. 45c,
482-86; Cunningham et al., 1994) and higher plants (Bartley et al., Proc. Natl. Acad. Sci. USA
88, 6532-36; Martinez-Ferez & Vioque, 1992, Plant Mol. Biol. 18, 981-83). Many of the
isolated enzymes show a great diversity in structure, function and inhibitory properties 
between sources. For example, phytoene desaturases from the cyanobacterium 
Synechococcus and from higher plants and green algae carry out a two-step desaturation to 
yield ζ-carotene as a reaction product. In plants and cyanobacteria a second enzyme
(ζ-carotene desaturase), similar in amino acid sequence to the phytoene desaturase, catalyzes
two additional desaturations to yield lycopene. In contrast, a single desaturase enzyme from 
Erwinia herbicola and from other bacteria introduces all four double bonds required to form 
lycopene. The Erwinia and other bacterial desaturases bear little amino acid sequence 
similarity to the plant and cyanobacterial desaturase enzymes, and are thought to be of 
unrelated ancestry. Therefore, even with a gene in hand from one source, it may be difficult to 
identify a gene encoding an enzyme of similar function in another organism. In particular, 
the sequence similarity between certain of the prokaryotic and eukaryotic genes encoding 
enzymes of carotenoid biosynthesis is quite low. 

Further, the mechanism of gene expression in prokaryotes and eukaryotes appears to 
differ sufficiently such that one cannot expect that an isolated eukaryotic gene will be 
properly expressed in a prokaryotic host. 

The difficulty of isolating genes encoding enzymes with similar functions is
exemplified by recent efforts to isolate the gene encoding the enzyme that catalyzes the
formation of β-carotene from the acyclic precursor lycopene. Although a gene encoding an
enzyme with this function had been isolated from a bacterium, it had not been isolated from
any photosynthetic prokaryote or from any eukaryotic organism. The isolation and
characterization of the enzyme catalyzing formation of β-carotene in the cyanobacterium
Synechococcus PCC7942 were described by the present inventors and others (Cunningham et
al., 1993 and 1994). The amino acid sequence similarity of the cyanobacterial enzyme to the
various bacterial lycopene β-cyclases is so low (ca. 18-25% overall; Cunningham et al.,
1994) that there is much uncertainty as to whether they share a common ancestry or, instead,
represent an example of convergent evolution.

The need remains for the isolation of eukaryotic and prokaryotic genes and cDNAs 
encoding polypeptides involved in the carotenoid biosynthetic pathway, including those 
encoding a lycopene ε-cyclase, IPP isomerase and β-carotene hydroxylase. There remains a
need for methods to enhance the production of carotenoids, to alter the composition of 
carotenoids, and to reduce or eliminate carotenoid production. There also remains a need in 
the art for methods for screening for genes and cDNAs encoding enzymes of carotenoid 
biosynthesis and metabolism. 

SUMMARY OF THE INVENTION
Accordingly, a first object of this invention is to provide purified and/or isolated
nucleic acids which encode enzymes involved in carotenoid biosynthesis; in particular,
lycopene ε-cyclase, IPP isomerase and β-carotene hydroxylase.

A second object of this invention is to provide purified and/or isolated nucleic acids 
which encode enzymes which produce novel or uncommon carotenoids. 

A third object of the present invention is to provide vectors containing said genes. 
A fourth object of the present invention is to provide hosts transformed with said 
vectors. 

Another object of the present invention is to provide hosts which accumulate novel or 
uncommon carotenoids or which accumulate greater amounts of specific or total carotenoids. 

Another object of the present invention is to provide hosts with inhibited and/or 
altered carotenoid production. 

Another object of this invention is to secure the expression of eukaryotic
carotenoid-related genes in a recombinant prokaryotic host.

Yet another object of the present invention is to provide a method for screening for
eukaryotic and prokaryotic genes and cDNAs which encode enzymes involved in carotenoid
biosynthesis and metabolism.

An additional object of the invention is to provide a method for manipulating 
carotenoid biosynthesis in photosynthetic organisms by inhibiting the synthesis of certain 
enzymatic products to cause accumulation of precursor compounds. 

Another object of the invention is to provide modified lycopene ε-cyclase, IPP
isomerase and β-carotene hydroxylase.

These and other objects of the present invention have been realized by the present
inventors as described below.

A subject of the present invention is an isolated and/or purified nucleic acid sequence
which encodes a protein having lycopene ε-cyclase, IPP isomerase or β-carotene
hydroxylase enzyme activity and having the amino acid sequence of SEQ ID NOS: 2, 4, 14-21
or 23-27.

The invention also includes vectors which comprise any of the nucleic acid sequences 
listed above, and host cells transformed with such vectors. 

Another subject of the present invention is a method of producing or enhancing the
production of a carotenoid in a host cell, comprising inserting into the host cell a vector
comprising a heterologous nucleic acid sequence which encodes a protein having
lycopene ε-cyclase, IPP isomerase or β-carotene hydroxylase enzyme activity, wherein the
heterologous nucleic acid sequence is operably linked to a promoter; and expressing the
heterologous nucleic acid sequence to produce the protein.

Yet another subject of the present invention is a method of modifying the production
of carotenoids in a host cell, the method comprising inserting into the host cell a vector
comprising a heterologous nucleic acid sequence which produces an RNA and/or encodes
a protein which modifies lycopene ε-cyclase, IPP isomerase or β-carotene hydroxylase
enzyme activity, relative to an untransformed host cell, wherein the heterologous nucleic acid
sequence is operably linked to a promoter; and expressing the heterologous nucleic acid
sequence in the host cell to modify the production of the carotenoids in the host cell, relative
to the untransformed host cell.

The present invention also includes a method of expressing, in a host cell, a
heterologous nucleic acid sequence which encodes a protein having lycopene ε-cyclase,
IPP isomerase or β-carotene hydroxylase enzyme activity, the method comprising inserting
into the host cell a vector comprising the heterologous nucleic acid sequence, wherein the
heterologous nucleic acid sequence is operably linked to a promoter; and expressing the
heterologous nucleic acid sequence.

Also included is a method of expressing, in a host cell, a heterologous nucleic acid 
sequence which encodes a protein which modifies lycopene ε-cyclase, IPP isomerase or
β-carotene hydroxylase enzyme activity in the host cell, relative to an untransformed host
cell, the method comprising inserting into the host cell a vector comprising the heterologous 
nucleic acid sequence, wherein the heterologous nucleic acid sequence is operably linked to a 
promoter; and expressing the heterologous nucleic acid sequence. 

Another subject of the present invention is a method for screening for genes and 
cDNAs which encode enzymes involved in carotenoid biosynthesis and metabolism. 

BRIEF DESCRIPTION OF THE DRAWINGS
A more complete appreciation of the invention and many of the attendant advantages 
thereof will be readily obtained as the same becomes better understood by reference to the 
following detailed description when considered in connection with the accompanying 
drawings, wherein: 

Figure 1 is a schematic representation of the putative pathway of β-carotene
biosynthesis in cyanobacteria, algae and plants. The enzymes catalyzing various steps are
indicated at the left. Target sites of the bleaching herbicides NFZ and MPTA are also
indicated at the left. Abbreviations: DMAPP, dimethylallyl pyrophosphate; FPP, farnesyl
pyrophosphate; GGPP, geranylgeranyl pyrophosphate; GPP, geranyl pyrophosphate; IPP,
isopentenyl pyrophosphate; LCY, lycopene cyclase; MVA, mevalonic acid; MPTA,
2-(4-methylphenoxy)triethylamine hydrochloride; NFZ, norflurazon; PDS, phytoene desaturase;
PSY, phytoene synthase; ZDS, ζ-carotene desaturase; PPPP, prephytoene pyrophosphate.

Figure 2 depicts possible routes of synthesis of cyclic carotenoids and common plant
and algal xanthophylls (oxycarotenoids) from neurosporene. Demonstrated activities of the
β- and ε-cyclase enzymes of A. thaliana are indicated by bold arrows labelled with β or ε,
respectively. A bar below the arrow leading to ε-carotene indicates that the enzymatic
activity was examined but no product was detected. The steps marked by an arrow with a
dotted line have not been specifically examined. Conventional numbering of the carbon
atoms is given for neurosporene and α-carotene. Inverted triangles (▼) mark positions of the
double bonds introduced as a consequence of the desaturation reactions.

Figure 3 depicts the carotene endgroups which are found in plants. 

Figure 4 is a DNA sequence and the predicted amino acid sequence of a lycopene
ε-cyclase cDNA isolated from A. thaliana (SEQ ID NOS: 1 and 2). These sequences were
deposited under GenBank accession number U50738. This cDNA is incorporated into the
plasmid pATeps.

Figure 5 is a DNA sequence encoding the β-carotene hydroxylase isolated from A.
thaliana (SEQ ID NO: 3). This cDNA is incorporated into the plasmid pATOHB.

Figure 6 is an alignment of the predicted amino acid sequence of A. thaliana
β-carotene hydroxylase (SEQ ID NO: 4) with those of the bacterial β-carotene hydroxylase
enzymes from Alcaligenes sp. (SEQ ID NO: 5) (GenBank D58422), Erwinia herbicola Eho10
(SEQ ID NO: 6) (GenBank M872280), Erwinia uredovora (SEQ ID NO: 7) (GenBank
D90087) and Agrobacterium aurantiacum (SEQ ID NO: 8) (GenBank D58420). A
consensus sequence is also shown. All five sequences are identical where a capital letter appears
in the consensus. A lowercase letter indicates that three of the five, including A. thaliana, have
the identical residue. TM, transmembrane.

Figure 7 is a DNA sequence of a cDNA encoding an IPP isomerase isolated from A. 
thaliana (SEQ ID NO: 9). This cDNA is incorporated into the plasmid pATDP5. 

Figure 8 is a DNA sequence of a second cDNA encoding another IPP isomerase 
isolated from A. thaliana (SEQ ID NO: 10). This cDNA is incorporated into the plasmid 
pATDP7. 

Figure 9 is a DNA sequence of a cDNA encoding an IPP isomerase isolated from
Haematococcus pluvialis (SEQ ID NO: 11). This cDNA is incorporated into the plasmid
pHP04.

Figure 10 is a DNA sequence of a second cDNA encoding another IPP isomerase 
isolated from Haematococcus pluvialis (SEQ ID NO: 12). This cDNA is incorporated into 
the plasmid pHP05. 

Figure 11 is an alignment of the amino acid sequences predicted by IPP isomerase
cDNAs isolated from A. thaliana (SEQ ID NOS: 16 and 18), H. pluvialis (SEQ ID NOS: 14
and 15), Clarkia breweri (SEQ ID NO: 17) (see Blanc & Pichersky, Plant Physiol. (1995)
108:855; GenBank accession no. X82627) and Saccharomyces cerevisiae (SEQ ID NO: 19)
(GenBank accession no. J05090).

Figure 12 is a DNA sequence of the cDNA encoding an IPP isomerase isolated from
Tagetes erecta (marigold; SEQ ID NO: 13). This cDNA is incorporated into the plasmid
pPMDP1. A run of x's denotes a region not originally sequenced. Figure 21A shows the complete
marigold sequence.

Figure 13 is an alignment of the consensus sequence of four plant β-cyclases (SEQ ID
NO: 20) with the A. thaliana lycopene ε-cyclase (SEQ ID NO: 21). A capital letter in the
plant β consensus is used where all four β-cyclase genes predict the same amino acid residue
in this position. A small letter indicates that an identical residue was found in three of the
four. Dashes indicate that the amino acid residue was not conserved and dots in the sequence
denote a gap. A consensus for the aligned sequences is given, in capital letters below the
alignment, where the β- and ε-cyclases have the same amino acid residue. Arrows indicate
some of the conserved amino acids that will be used as junction sites for construction of
chimeric cyclases with novel enzymatic activities. Several regions of interest, including a
sequence signature indicative of a dinucleotide-binding motif and two predicted
transmembrane (TM) helical regions, are indicated below the alignment and are underlined.

Figure 14 shows the nucleotide (SEQ ID NO:22) and amino acid (SEQ ID
NO:23) sequences of the Adonis palaestina (pheasant's eye) ε-cyclase cDNA #5.

Figure 15A shows the nucleotide (SEQ ID NO:24) and amino acid (SEQ
ID NO:25) sequences of a potato ε-cyclase cDNA. Figure 15B shows the amino acid sequence (SEQ ID
NO:26) of a chimeric lettuce/potato lycopene ε-cyclase. Amino acids in lower case are from
the lettuce cDNA and those in upper case are from the potato cDNA. The product of this
chimeric cDNA has ε-cyclase activity and converts lycopene to the monocyclic δ-carotene.

Figure 16 shows a comparison between the amino acid sequences of the Arabidopsis
ε-cyclase (SEQ ID NO:27) and the potato ε-cyclase (SEQ ID NO:25).

Figure 17A shows the nucleotide sequence of the Adonis palaestina Ipi1 (SEQ ID
NO:28) and Figure 17B shows the nucleotide sequence of the Adonis palaestina Ipi2 (SEQ
ID NO:29).

Figure 18A shows the nucleotide sequence of the Haematococcus pluvialis Ipi1 (SEQ
ID NO:11) and Figure 18B shows the nucleotide sequence of the Haematococcus pluvialis Ipi2
(SEQ ID NO:30).

Figure 19A shows the nucleotide sequence of the Lactuca sativa (romaine lettuce)
Ipi1 (SEQ ID NO:31) and Figure 19B shows the nucleotide sequence of the Lactuca sativa
Ipi2 (SEQ ID NO:32).

Figure 20 shows the nucleotide sequence of the Chlamydomonas reinhardtii Ipi1
(SEQ ID NO:33).

Figure 21A shows the nucleotide sequence of the Tagetes erecta (marigold) Ipi1 (SEQ
ID NO:34) and Figure 21B shows the nucleotide sequence of the Oryza sativa (rice) Ipi1
(SEQ ID NO:35).

Figure 22 shows an amino acid sequence alignment of various plant and green algal
isopentenyl isomerases (IPI) (SEQ ID NOS:16, 36-45).

Figure 23 shows a comparison between Adonis palaestina ε-cyclase cDNA #3 and
cDNA #5 nucleotide sequences.

Figure 24 shows a comparison between Adonis palaestina ε-cyclase cDNA #3 and
cDNA #5 predicted amino acid sequences.

Figure 25 shows a sequence alignment of various plant β- and ε-cyclases. Those
sequences outlined in grey denote identical sequences among the ε-cyclases. Those
sequences outlined in black denote identical sequences among both the β- and ε-cyclases.

Figure 26 shows a sequence alignment of the plant ε-cyclases from Figure 25. Those
sequences outlined in black denote identical sequences among the ε-cyclases.

Figure 27 is a dendrogram or "tree" illustrating the degree of amino acid sequence
similarity for various lycopene β- and ε-cyclases.

Figure 28 shows a comparison between Arabidopsis ε-cyclase and lettuce ε-cyclase
predicted amino acid sequences.



DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention includes an isolated and/or purified nucleic acid sequence
which encodes a protein having lycopene ε-cyclase, IPP isomerase or β-carotene
hydroxylase enzyme activity and having the amino acid sequence of SEQ ID NOS: 2, 4, 14-21,
23 or 25-27. Nucleic acids encoding lycopene ε-cyclase, β-carotene hydroxylase and IPP
isomerases have been isolated from several genetically distant sources.

The present inventors have isolated nucleic acids encoding the enzyme IPP isomerase,
which catalyzes the reversible conversion of isopentenyl pyrophosphate (IPP) to
dimethylallyl pyrophosphate (DMAPP). IPP isomerase cDNAs were isolated from the plants
A. thaliana, Tagetes erecta (marigold), Adonis palaestina (pheasant's eye) and Lactuca sativa
(romaine lettuce), and from the green algae H. pluvialis and Chlamydomonas reinhardtii.
Alignments of the amino acid sequences predicted by some of these cDNAs are shown in
Figures 12 and 22. Plasmids containing some of these cDNAs were deposited with the
American Type Culture Collection, 12301 Parklawn Drive, Rockville MD 20852 on March 4,
1996 under ATCC accession numbers 98000 (pHP05 - H. pluvialis); 98001 (pMDP1 -
marigold); 98002 (pATDP7 - A. thaliana) and 98004 (pHP04 - H. pluvialis).

The present inventors have also isolated nucleic acids encoding the enzyme
β-carotene hydroxylase, which is responsible for hydroxylating the β-endgroup in carotenoids.
The nucleic acid of the present invention is shown in SEQ ID NO: 3 and Figure 5. The
full-length cDNA product hydroxylates both end groups of β-carotene, as do products of cDNAs
which encode proteins truncated by up to 50 amino acids from the N-terminus. Products of
genes which encode proteins truncated between about 60-110 amino acids from the
N-terminus preferentially hydroxylate only one ring. A plasmid containing this gene was
deposited with the American Type Culture Collection, 12301 Parklawn Drive, Rockville MD
20852 on March 4, 1996 under ATCC accession number 98003 (pATOHB - A. thaliana).

The present inventors have also isolated nucleic acids encoding the enzyme lycopene
ε-cyclase, which is responsible for the formation of ε-endgroups in carotenoids. The A.
thaliana ε-cyclase adds an ε ring to only one end of the symmetrical lycopene, while the
related β-cyclase adds a ring at both ends. The A. thaliana cDNA of the present invention is
shown in Figure 4 and SEQ ID NO: 1. A plasmid containing this gene was deposited with
the American Type Culture Collection, 12301 Parklawn Drive, Rockville MD 20852 on
March 4, 1996 under ATCC accession number 98005 (pATeps - A. thaliana).

In addition, lycopene ε-cyclases have been identified in lettuce and in Adonis
palaestina (cDNA #5) which encode enzymes that convert lycopene to the bicyclic
ε-carotene (ε,ε-carotene). An additional cDNA from Adonis palaestina (cDNA #3) encodes a
lycopene ε-cyclase which converts lycopene into δ-carotene (ε,ψ-carotene) and differs from
the lycopene ε-cyclase which forms bicyclic ε-carotene (ε,ε-carotene) by only 5 amino acids.
One or more of these amino acids may be modified by alteration of the nucleotide sequence
in the #5 cDNA to obtain an enzyme which forms the bicyclic ε,ε-carotene. The sequences
of the Adonis palaestina and Arabidopsis thaliana ε-cyclases have about 70% nucleotide
identity and about 72% amino acid identity.

Initial experiments by the inventors with chimeric genes indicated that the part of the
ε-cyclase which is responsible for adding two ε rings to form ε,ε-carotene is the
carboxy-terminal portion of the gene. The lettuce ε-cyclase adds two ε rings to form ε,ε-carotene. A
DNA encoding a partial potato ε-cyclase (missing its amino-terminal portion), when
combined with an amino-terminal region from the lettuce ε-cyclase gene, produces the
monocyclic δ-carotene (ε,ψ-carotene). With the discovery of the differences between the
Adonis palaestina clone #3 and clone #5, the specific amino acids responsible for the addition
of an extra ε ring have been identified (Figure 24). Specifically, amino acid 55 is Thr in
clone #3 and Ser in clone #5, amino acid 210 is Asn in clone #3 and Asp in clone #5, amino
acid 231 is Asp in clone #3 and Glu in clone #5, amino acid 352 is Ile in clone #3 and Val in
clone #5, and amino acid 524 is Lys in clone #3 and Arg in clone #5. It can be appreciated
that these changes are quite conservative, as only one change, at amino acid 210, changes the
charge of the protein.

Thus, it is clear that the nucleic acids of the invention encoding the enzymes as 
presently disclosed may be altered to increase a particularly desirable property of the enzyme, 
to change a property of the enzyme, or to diminish an undesirable property of the enzyme. 
Such modifications can be by deletion, substitution, or insertion of one or more amino acids, 
and can be performed by routine enzymatic manipulation of the nucleic acid encoding the 
enzyme (such as by restriction enzyme digestion, removal of nucleotides by mung bean
nuclease or Bal31, insertion of nucleotides by Klenow fragment, and by religation of the
ends), by site-directed mutagenesis, or may be accidental, such as by low fidelity PCR or 
those obtained through mutations in hosts that are producers of the enzymes. These 
techniques as well as other suitable techniques are well known in the art. 

Mutations can be made in the nucleic acids of the invention such that a particular 
codon is changed to a codon which codes for a different amino acid. Such a mutation is 
generally made by making the fewest nucleotide changes possible. A substitution mutation 
of this sort can be made to change an amino acid in the resulting protein in a non- 
conservative manner (i.e., by changing the codon from an amino acid belonging to a grouping 
of amino acids having a particular size or characteristic to an amino acid belonging to another
grouping) or in a conservative manner (i.e., by changing the codon from an amino acid 
belonging to a grouping of amino acids having a particular size or characteristic to an amino 
acid belonging to the same grouping). Such a conservative change generally leads to less 
change in the structure and function of the resulting protein. A non-conservative change is 
more likely to alter the structure, activity or function of the resulting protein. The present 
invention should be considered to include sequences containing conservative changes which 
do not significantly alter the activity or binding characteristics of the resulting protein. 
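
By way of illustration only, the choice of a replacement codon requiring the fewest nucleotide changes can be sketched in Python as follows. The sketch assumes the standard genetic code; the name `closest_codons` is hypothetical and forms no part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def closest_codons(codon, target_aa):
    """Return (n_changes, codons): the codons for target_aa needing the fewest base changes."""
    codon = codon.upper()
    candidates = [c for c, aa in CODON_TABLE.items() if aa == target_aa]
    if codon not in CODON_TABLE or not candidates:
        raise ValueError("unknown codon or amino acid")
    # Hamming distance between the starting codon and each candidate codon.
    distance = {c: sum(a != b for a, b in zip(codon, c)) for c in candidates}
    best = min(distance.values())
    return best, sorted(c for c, d in distance.items() if d == best)


if __name__ == "__main__":
    # Asn (AAT) to Asp: a single base change (A to G in the first position) suffices.
    print(closest_codons("AAT", "D"))  # (1, ['GAT'])
```

Where several codons tie, all are returned, so that codon usage of the intended host may guide the final choice.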

The following is one example of various groupings of amino acids:
Amino acids with nonpolar R groups: Alanine, Valine, Leucine, Isoleucine, Proline,
Phenylalanine, Tryptophan and Methionine.

Amino acids with uncharged polar R groups: Glycine, Serine, Threonine, Cysteine, Tyrosine,
Asparagine and Glutamine.

Amino acids with charged polar R groups (negatively charged at pH 6.0): Aspartic acid and
Glutamic acid.

Basic amino acids (positively charged at pH 6.0): Lysine, Arginine and Histidine.

Another grouping may be those amino acids with phenyl groups: Phenylalanine,
Tryptophan and Tyrosine.

Another grouping may be according to molecular weight (i.e., size of R groups).
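
As a minimal sketch, the first four groupings listed above can be encoded and used to label a substitution as conservative or non-conservative. The function names are hypothetical, and treating "same grouping" as the sole test of conservativeness is a simplifying assumption made for this example.

```python
# The four R-group groupings listed above, in one-letter codes; one possible scheme among several.
GROUPS = {
    "nonpolar": set("AVLIPFWM"),
    "uncharged polar": set("GSTCYNQ"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def group_of(residue):
    """Return the name of the grouping that contains the residue."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """Treat a substitution as conservative when both residues share a grouping."""
    return group_of(original) == group_of(replacement)


if __name__ == "__main__":
    # Differences between Adonis palaestina clones #3 and #5 as stated in the text.
    differences = {55: ("T", "S"), 210: ("N", "D"), 231: ("D", "E"),
                   352: ("I", "V"), 524: ("K", "R")}
    for position, (clone3, clone5) in differences.items():
        kind = "conservative" if is_conservative(clone3, clone5) else "non-conservative"
        print(position, clone3, clone5, kind)
```

Under this scheme only the change at position 210 (Asn to Asp) crosses a grouping boundary, in keeping with the observation that it is the only one of the five differences that alters the charge.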

Particularly preferred substitutions are: 

- Lys for Arg and vice versa such that a positive charge may be maintained; 

- Glu for Asp and vice versa such that a negative charge may be maintained; 

- Ser for Thr such that a free -OH can be maintained; and 

- Gln for Asn such that a free NH2 can be maintained.

Amino acid substitutions may also be introduced to substitute an amino acid with a 
particularly preferable property. For example, a Cys may be introduced to provide a potential 
site for disulfide bridges with another Cys. A His may be introduced as a particularly 
"catalytic" site (i.e., His can act as an acid or base and is the most common amino acid in 
biochemical catalysis). Pro may be introduced because of its particularly planar structure,
which induces β-turns in the protein's structure.

WO 99/63055 PCT/US99/12121

It is clear that certain modifications of SEQ ID NOS: 2, 4, 14-21, 23 or 25-27 can take
place without destroying the activity of the enzyme. It is noted especially that truncated
versions of the nucleic acids of the invention are functional. For example, several amino
acids (from 1 to about 120) can be deleted from the N-terminus of the lycopene ε-cyclases of
the invention, and a functional protein can still be produced. This fact is made especially
clear from Figure 25, which shows a sequence alignment of several plant ε-cyclases. As can
be seen from Figure 25, there is an enormous amount of sequence disparity between amino 
acid sequences 2 to about 50-70 (depending on the particular sequence, since gaps are 
present). There is less, but also a substantial amount of, sequence dissimilarity between about 
50-70 to about 90-120 (depending on the particular sequence). Thereafter, the sequences are 
fairly conserved, except for small pockets of dissimilarity between about 275-295 to about 
285-305 (depending on the particular sequence), and between about 395-415 to about 410- 
430 (depending on the particular sequence). 

The present inventors have found that the amount of the 5' region present in the
nucleic acids of the invention can alter the activity of the enzyme. Instead of diminishing 
activity, truncating the 5' region of the nucleic acids of the invention may result in an enzyme 
with a different specificity. Thus, the present invention relates to nucleic acids and enzymes 
encoded thereby which are truncated to within 0-50, preferably 0-25, codons of the 5' 
initiation codon of their prokaryotic counterparts as determined by alignment maps as 
discussed below. 

For example, when the cDNA encoding A. thaliana β-carotene hydroxylase was
truncated, the resulting enzyme catalyzed the formation of β-cryptoxanthin as the major
product and zeaxanthin as a minor product, in contrast to its normal production of zeaxanthin.

The present invention is intended to include those nucleic acid and amino acid 
sequences in which substitutions, deletions, additions or other modifications have taken 
place, as compared to SEQ ID NOS: 2, 4, 14-21, 23 or 25-27, without destroying the activity 
of the enzyme. Preferably, the substitutions, deletions, additions or other modifications take 
place at the 5' end, or any other of those positions which already show dissimilarity between 
any of the presently disclosed amino acid sequences (see also Figure 25) or other amino acid 
sequences which are known in the art and which encode the same enzyme (i.e., lycopene
ε-cyclase, IPP isomerase or β-carotene hydroxylase).

In each case, nucleic acid and amino acid sequence similarity and identity are measured
using sequence analysis software, for example, the Sequence Analysis, Gap, or BestFit
software packages of the Genetics Computer Group (University of Wisconsin Biotechnology
Center, 1710 University Avenue, Madison, Wisconsin 53705), MEGAlign (DNAStar, Inc.,
1228 S. Park St., Madison, Wisconsin 53715), or MacVector (Oxford Molecular Group, 2105
S. Bascom Avenue, Suite 200, Campbell, California 95008). Such software uses algorithms 
to match similar sequences by assigning degrees of identity to various substitutions, 
deletions, and other modifications, and includes detailed instructions as to useful parameters, 
etc., such that those of routine skill in the art can easily compare sequence similarities and 
identities. An example of a useful algorithm in this regard is the algorithm of Needleman and 
Wunsch, which is used in the Gap program discussed above. This program finds the 
alignment of two complete sequences that maximizes the number of matches and minimizes 
the number of gaps. Another useful algorithm is the algorithm of Smith and Waterman, 
which is used in the BestFit program discussed above. This program creates an optimal 
alignment of the best segment of similarity between two sequences. Optimal alignments are 
found by inserting gaps to maximize the number of matches using the local homology 
algorithm of Smith and Waterman. 
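
For illustration, a global alignment of the Needleman and Wunsch type, together with a percent identity computed from it, may be sketched as follows. This is not the implementation used by the Gap program: the scoring values (match +1, mismatch -1, gap -1) and the function names are assumptions chosen for brevity, whereas production software typically applies substitution matrices and separate gap-opening and gap-extension penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[len(a)][len(b)]


def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of the alignment length (gap columns included)."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("ACGT", "AGT")
    print(top, bottom, total, percent_identity(top, bottom))
```

Percent identity depends on the denominator chosen; here gap columns count toward the alignment length, which is one common convention among several.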

Conservative (i.e. similar) substitutions typically include substitutions within the 
following groups: glycine and alanine; valine, isoleucine and leucine; aspartic acid, glutamic 
acid, asparagine and glutamine; serine and threonine; lysine and arginine; and phenylalanine 
and tyrosine. Substitutions may also be made on the basis of conserved hydrophobicity or 
hydrophilicity (see Kyte and Doolittle, J. Mol. Biol. 157: 105-132 (1982)), or on the basis of 
the ability to assume similar polypeptide secondary structure (see Chou and Fasman, Adv. 
Enzymol. 47: 45-148 (1978)). 
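The substitution groups listed above can be expressed, purely as an illustrative sketch, in the following Python fragment. The one-letter residue codes are standard; the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def is_conservative(a, b):
    """Return True if residues a and b differ but belong to one listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_positions(seq1, seq2):
    """Count identical, conservative and other positions in two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must already be aligned to equal length")
    counts = {"identical": 0, "conservative": 0, "other": 0}
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y:
            counts["identical"] += 1
        elif is_conservative(x, y):
            counts["conservative"] += 1
        else:
            counts["other"] += 1
    return counts
```

This sketch covers only the listed groups; substitutions chosen on the basis of hydrophobicity or secondary structure propensity would require the scales of the cited references and are not modeled here.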

If comparison is made between nucleotide sequences, preferably the length of 
comparison sequences is at least 50 nucleotides, more preferably at least 60 nucleotides, at 
least 75 nucleotides or at least 100 nucleotides. It is most preferred if comparison is made 
between the nucleic acid sequences encoding the enzyme coding regions necessary for 
enzyme activity. If comparison is made between amino acid sequences, preferably the length 
of comparison is at least 20 amino acids, more preferably at least 30 amino acids, at least 40 
amino acids or at least 50 amino acids. It is most preferred if comparison is made between 
the amino acid sequences in the enzyme coding regions necessary for enzyme activity. 

It should be appreciated that also within the scope of the present invention are nucleic
acid sequences encoding lycopene ε-cyclases, IPP isomerases and β-carotene hydroxylases
which code for enzymes having the same amino acid sequence as SEQ ID NOS: 2, 4, 14-21,
23 or 25-27, but which are degenerate to the nucleic acids specifically disclosed herein.
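The notion of degenerate nucleic acids can be illustrated by the following Python sketch, which translates two different nucleotide sequences with the standard genetic code and confirms that they specify the same amino acid sequence. The function names and the short example sequences are hypothetical and are not taken from the sequence listing; use of the standard code table is an assumption.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna1, dna2):
    """Return True if two coding sequences specify the same amino acid sequence."""
    return translate(dna1) == translate(dna2)
```

For example, the hypothetical sequences ATGGCTCTTAAA and ATGGCCTTGAAG differ at three nucleotide positions yet both translate to the same four residues, so each is degenerate to the other in the sense used above.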

The amino acid residues described herein are preferred to be in the "L" isomeric form. 
However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, 
as long as the desired functional property of immunoglobulin-binding is retained by the 
polypeptide. 

In accordance with the present invention there may be employed conventional 
molecular biology, microbiology, and recombinant DNA techniques within the skill of the 
art. Such techniques are explained fully in the literature. See, e.g., Sambrook et al, 
"Molecular Cloning: A Laboratory Manual" (1989); "Current Protocols in Molecular 
Biology" Volumes I-III [Ausubel, R. M., ed. (1994)]; "Cell Biology: A Laboratory 
Handbook" Volumes I-III [J. E. Celis, ed. (1994)]; "Current Protocols in Immunology" 
Volumes I-III [Coligan, J. E., ed. (1994)]; "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); 
"Nucleic Acid Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription And 
Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture" [R.I. 
Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press, (1986)]; B. Perbal, "A 
Practical Guide To Molecular Cloning" (1984). 

The present invention also includes vectors. Suitable vectors according to the present 
invention comprise a nucleic acid of the invention encoding an enzyme involved in 
carotenoid biosynthesis or metabolism and a suitable promoter for the host, and can be 
constructed using techniques well known in the art (for example Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY,
1989; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing and Wiley
Interscience, New York, 1991). Suitable vectors for eukaryotic expression in plants are
described in Frey et al., Plant J. (1995) 8(5):693 and Misawa et al, 1994a; incorporated herein
by reference. Suitable vectors for prokaryotic expression include pACYC184, pUC119, and
pBR322 (available from New England BioLabs, Beverly, MA) and pTrcHis (Invitrogen) and
pET28 (Novagen) and derivatives thereof. The vectors of the present invention can 
additionally contain regulatory elements such as promoters, repressors, selectable markers 
such as antibiotic resistance genes, etc. 

WO 99/63055 PCT/US99/12121

The nucleic acids encoding the carotenoid enzymes as described above, when cloned
into a suitable expression vector, can be used to overexpress these enzymes in a plant
expression system or to inhibit the expression of these enzymes. For example, a vector
containing the gene encoding lycopene ε-cyclase can be used to increase the amount of
α-carotene and carotenoids derived from α-carotene (such as lutein and α-cryptoxanthin) in an
organism and thereby alter the nutritional value, pharmacology and visual appearance value
of the organism.

Therefore, the present invention includes a method of producing or enhancing the 
production of a carotenoid in a host cell, relative to an untransformed host cell, the method 
comprising inserting into the host cell a vector comprising a heterologous nucleic acid 
sequence which encodes for a protein having lycopene ε-cyclase, IPP isomerase or
β-carotene hydroxylase enzyme activity, wherein the heterologous nucleic acid sequence is
operably linked to a promoter; and expressing the heterologous nucleic acid sequence to 
produce the protein. 

The present invention also includes a method of modifying the production of 
carotenoids in a host cell, the method comprising inserting into the host cell a vector 
comprising a heterologous nucleic acid sequence which produces an RNA and/or encodes for 
a protein which modifies lycopene ε-cyclase, IPP isomerase or β-carotene hydroxylase
enzyme activity, relative to an untransformed host cell, wherein the heterologous nucleic acid 
sequence is operably linked to a promoter; and expressing the heterologous nucleic acid 
sequence in the host cell to modify the production of the carotenoids in the host cell, relative 
to the untransformed host cell. 

The term "modifying the production" means that the amount of carotenoids produced 
in the host cell can be enhanced, reduced, or left the same, as compared to the untransformed 
host cell. In accordance with one embodiment of the present invention, the make-up of the 
carotenoids (i.e., the specific carotenoids produced) is changed vis a vis each other, and this 
change in make-up may result in either a net gain, net loss, or no net change in the total 
amount of carotenoids produced in the cell. In accordance with another embodiment of the 
present invention, the production or the biochemical activity of the carotenoids (or the 
enzymes which catalyze their formation) is enhanced by the insertion of an enzyme-encoding 
nucleic acid of the invention. In yet another embodiment of the invention, the production or 
the biochemical activity of the carotenoids (or the enzymes which catalyze their formation) 
may be reduced or inhibited by a number of different approaches available to those skilled in 
the art, including but not limited to such methodologies or approaches as anti-sense (e.g.,
Gray et al (1992) Plant Mol. Biol. 19:69-87), ribozymes (e.g., Wegener et al (1994) Mol.
Gen. Genet. 245:465-470), co-suppression (e.g., Fray and Grierson (1993) Plant Mol. Biol.
22:589-602), targeted disruption of the gene (e.g., Schaefer et al. (1997) Plant J.
11:1195-1206), intracellular antibodies (e.g., Rondon and Marasco (1997) Ann. Rev. Microbiol.
51:257-283) or any other approach that relies on the knowledge or availability of the nucleic
acid or amino acid sequences of the invention and/or portions thereof, to thereby reduce
accumulation of carotenoids with ε rings and compounds derived from them (for ε-cyclase
inhibition), or carotenoids with hydroxylated β rings and compounds derived from them (for
β-hydroxylase inhibition), or, in the case of IPP isomerase, accumulation of any isoprenoid
compound.

Preferably, at least a portion of the nucleic acid sequences used in the methods, 
vectors and host cells of the invention codes for an enzyme having an amino acid sequence 
which is at least 85% identical, preferably at least 90%, at least 95% or completely identical 
to SEQ ID NOS: 2, 4, 14-21, 23 or 25-27. Sequence identity is determined as noted above. 
Preferably, sequence additions, deletions or other modifications are made as indicated above, 
so as to not affect the function of the particular enzyme. 
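As a non-limiting sketch, the identity thresholds recited above can be applied to a pair of already aligned sequences as follows. The choice of denominator (all alignment columns, including gapped columns) is an assumption made for illustration; commercial packages may count differently. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned1, aligned2):
    """Percent of alignment columns in which both sequences have the same residue.

    Both inputs must be aligned to equal length, with '-' marking gaps.
    Assumption: the denominator is the total number of alignment columns.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have equal length")
    if not aligned1:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned1, aligned2) if x == y and x != "-")
    return 100.0 * matches / len(aligned1)


def identity_tier(pid):
    """Map a percent identity onto the preference tiers recited in the text."""
    if pid == 100:
        return "completely identical"
    for threshold in (95, 90, 85):
        if pid >= threshold:
            return f"at least {threshold}%"
    return "below 85%"
```

A ten-residue alignment with a single mismatch, for instance, is 90% identical under this definition and therefore falls in the "at least 90%" tier.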

In a preferred embodiment, vectors are manufactured which contain a DNA encoding 
a eukaryotic IPP isomerase upstream of a DNA encoding a second eukaryotic carotenoid 
enzyme. The inventors have discovered that inclusion of an IPP isomerase gene increases the 
supply of substrate for the carotenoid pathway, thereby enhancing the production of
carotenoid endproducts, as compared to a host cell which is not transformed with such a 
vector. This is apparent from the much deeper pigmentation in carotenoid-accumulating 
colonies of E. coli which also contain one of the aforementioned IPP isomerase genes when 
compared to colonies that lack this additional IPP isomerase gene. Similarly, a vector 
comprising an IPP isomerase gene can be used to enhance production of any secondary 
metabolite of dimethylallyl pyrophosphate and/or isopentenyl pyrophosphate (such as 
isoprenoids, steroids, carotenoids, etc.). The term "isoprenoid" is intended to mean any 
member of the class of naturally occurring compounds whose carbon skeletons are composed, 
in part or entirely, of isopentyl C5 units. Preferably, the carbon skeleton is of an essential oil,
a fragrance, a rubber, a carotenoid, or a therapeutic compound, such as paclitaxel. 

A vector containing the cDNA encoding a lycopene ε-cyclase of the invention,
preferably the lettuce lycopene ε-cyclase or Adonis ε-cyclase #5, can be used to increase the
amount of bicyclic ε-carotene in an organism and thereby alter the nutritional value,
pharmacology and visual appearance value of the organism. In addition, the transformed
organism can be used in the formulation of therapeutic agents, for example in the treatment of
cancer (see Mayne et al (1996) FASEB J. 10:690-701; Tsushima et al (1995) Biol. Pharm.
Bull. 18:227-233).

An antisense strand of a nucleic acid of the invention can be inserted into a vector.
For example, the lycopene ε-cyclase gene can be inserted into a vector in the antisense
orientation and incorporated into the genomic DNA of a host, thereby inhibiting the synthesis
of ε,β-carotenoids (lutein and α-carotene) and enhancing the synthesis of β,β-carotenoids
(zeaxanthin and β-carotene).
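The relationship between a sense strand and its antisense strand can be illustrated by the following short Python sketch, which computes the reverse complement of a DNA sequence. The function name and the example sequence are hypothetical; the sketch does not address vector design, promoter choice, or the efficiency of antisense inhibition.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the antisense (reverse-complement) strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]
```

An insert cloned so that the promoter reads the reverse complement yields a transcript complementary to the mRNA of the sense gene; applying the function twice returns the original sequence.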

The present invention also relates to novel enzymes which are encoded by the amino 
acid sequences of the invention, or portions thereof. 

The present invention also relates to novel enzymes which can transform known
carotenoids into novel or uncommon products. Currently ε-carotene (see Figure 2) and
γ-carotene are commonly produced only in minor amounts. As described below, an enzyme
can be produced which transforms lycopene to γ-carotene and lycopene to ε-carotene. With
these products in hand, bulk synthesis of other carotenoids derived from them is possible.
For example, ε-carotene can be hydroxylated to form lactucaxanthin, an isomer of lutein (one
ε and one β ring) and zeaxanthin (two β rings) where both endgroups are, instead, ε rings.

In addition to novel enzymes produced by truncating the 5' region of known enzymes,
as discussed above, novel enzymes which can participate in the formation of unusual
carotenoids can be formed by replacing portions of one gene with an analogous sequence
from a structurally related gene. For example, β-cyclase and ε-cyclase are structurally
related (see Figure 13). By replacing a portion of β-lycopene cyclase with the analogous
portion of ε-cyclase, an enzyme which produces γ-carotene will be produced (one β
endgroup). Further, by replacing a portion of the lycopene ε-cyclase with the analogous
portion of β-cyclase, an enzyme which produces ε-carotene will be produced (with some
exceptions, such as the lettuce ε-cyclase, plant ε-cyclases normally produce a compound with
one ε-endgroup, δ-carotene). Similarly, β-hydroxylase could be modified to produce
enzymes of novel function by creation of hybrids with ε-hydroxylase.
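The domain replacement described above amounts, at the sequence level, to splicing a segment of one sequence into the corresponding position of another. The following Python sketch shows this operation under the assumption that the analogous regions have already been identified from an alignment and are supplied as zero-based, end-exclusive coordinates. The function name, the coordinates and the placeholder sequences are hypothetical.

```python
def make_chimera(recipient, donor, recipient_span, donor_span):
    """Replace recipient[r_start:r_end] with donor[d_start:d_end].

    Spans are (start, end) pairs, zero-based and end-exclusive. The caller is
    assumed to have chosen analogous regions, for example from an alignment.
    """
    r_start, r_end = recipient_span
    d_start, d_end = donor_span
    if not 0 <= r_start <= r_end <= len(recipient):
        raise ValueError("recipient span is out of range")
    if not 0 <= d_start <= d_end <= len(donor):
        raise ValueError("donor span is out of range")
    return recipient[:r_start] + donor[d_start:d_end] + recipient[r_end:]
```

For coding sequences, the chosen spans would additionally need to preserve the reading frame, a check that this minimal sketch leaves to the user.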

Host systems according to the present invention can comprise any organism that 
already produces carotenoids or which has been genetically modified to produce carotenoids.
The IPP isomerase genes are more broadly applicable for enhancing production of any 
product dependent on DMAPP and/or IPP as a precursor. 

Organisms which already produce carotenoids include plants, algae, some yeasts, 
fungi and cyanobacteria and other photosynthetic bacteria. Transformation of these hosts 
with vectors according to the present invention can be done using standard techniques such as 
those described in Misawa et al., (1990) supra; Hundle et al., (1993) supra; Hundle et al., 
(1991) supra; Misawa et al., (1991) supra; Sandmann et al., supra; and Schnurr et al., supra. 

Transgenic organisms can be constructed which include the nucleic acid sequences of 
the present invention (Bird et al, 1991; Bramley et al, 1992; Misawa et al, 1994a; Misawa et 
al, 1994b; Cunningham et al, 1993). The incorporation of these sequences can allow
control of carotenoid biosynthesis, content, or composition in the host cell. These
transgenic systems can be constructed to incorporate sequences which allow for the
overexpression of the nucleic acids of the present invention. Transgenic systems can also be
constructed containing antisense expression of the nucleic acid sequences of the present
invention. Such antisense expression would result in the accumulation of the substrates of the
enzyme encoded by the sense strand.

A method for screening for eukaryotic genes which encode enzymes involved in 
carotenoid biosynthesis comprises transforming a prokaryotic host with a nucleic acid which 
may contain a eukaryotic or prokaryotic carotenoid biosynthetic gene; culturing said 
transformed host to obtain colonies; and screening for colonies exhibiting a different color 
than colonies of the untransformed host. 

Suitable hosts include E. coli, cyanobacteria such as Synechococcus and 
Synechocystis, algae and plant cells. E. coli is preferred.

In a preferred embodiment, the above "color complementation" screening protocol can 
be enhanced by using mutants which are either (1) deficient in at least one carotenoid 
biosynthetic gene or (2) overexpress at least one carotenoid biosynthetic gene. In either case, 
such mutants will accumulate carotenoid precursors. 

Prokaryotic and eukaryotic DNA or cDNA libraries can be screened in total for the 
presence of genes of carotenoid biosynthesis, metabolism and degradation. Preferred 
organisms to be screened include photosynthetic organisms.

E. coli can be transformed with these eukaryotic cDNA libraries using conventional
methods such as those described in Sambrook et al, 1989 and according to protocols 
described by the vendors of the cloning vectors. 

For example, the cDNA libraries in bacteriophage vectors such as lambdaZAP 
(Stratagene) or lambda ZIPLOX (Gibco BRL) can be excised en masse and used to transform 
E. coli.

Transformed E. coli can be cultured using conventional techniques. The culture broth 
preferably contains antibiotics to select and maintain plasmids. Suitable antibiotics include 
penicillin, ampicillin, chloramphenicol, etc. Culturing is typically conducted at 15-40°C, 
preferably at room temperature or slightly above (18-28°C), for 12 hours to 7 days. 

Cultures are plated and the plates are screened visually for colonies with a different
color than the colonies of the host E. coli transformed with the empty plasmid cloning vector.
For example, E. coli transformed with the plasmid pAC-BETA (described below) produce
yellow colonies that accumulate β-carotene. After transformation with a cDNA library,
colonies which have a different hue than those formed by E. coli/pAC-BETA would be
expected to contain enzymes which modify the structure or accumulation of β-carotene.
Similar E. coli strains can be engineered which accumulate earlier products in carotenoid
biosynthesis, such as lycopene, γ-carotene, etc.
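The selection logic of the visual screen can be summarized, purely for illustration, in the following Python sketch. The background colors are those reported in this specification for hosts carrying pAC-BETA (yellow), pAC-BETA-04 (orange) and pAC-LYC (pink); the function name, the colony identifiers and the recorded colors are hypothetical, and in practice the scoring is done by eye.

```python
# Background colony colors reported in this specification for each reporter plasmid.
BACKGROUND_COLOR = {
    "pAC-BETA": "yellow",     # accumulates beta-carotene
    "pAC-BETA-04": "orange",  # pAC-BETA plus an IPP isomerase cDNA
    "pAC-LYC": "pink",        # accumulates lycopene
}


def candidate_colonies(reporter, colonies):
    """Return identifiers of colonies whose color differs from the background.

    `colonies` is an iterable of (colony_id, observed_color) pairs; both the
    identifiers and the color labels are hypothetical record-keeping values.
    """
    background = BACKGROUND_COLOR[reporter]
    return [colony_id for colony_id, color in colonies if color != background]
```

Colonies returned by such a tally are only presumptive positives; confirmation would rest on pigment analysis such as the chromatographic and spectral methods described in the examples.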

Having generally described this invention, a further understanding can be obtained by 
reference to certain specific examples which are provided herein for purposes of illustration 
only and are not intended to be limiting unless otherwise specified. 

EXAMPLE 

I. Isolation of β-carotene hydroxylase
Plasmid Construction

An 8.6 kb BglII fragment containing the carotenoid biosynthetic genes of Erwinia
herbicola was first cloned in the BamHI site of plasmid vector pACYC184 (chloramphenicol
resistant), and then a 1.1 kb BamHI fragment containing the E. herbicola β-carotene
hydroxylase (crtZ) was deleted. E. coli strains containing the resulting plasmid, pAC-BETA,
accumulate β-carotene and form yellow colonies (Cunningham et al., 1994).

A full length cDNA encoding IPP isomerase of Haematococcus pluvialis (HP04) was
first excised with BamHI and KpnI from pBluescript SK-, and then ligated into the
corresponding sites of the pTrcHisA vector with high-level expression from the trc promoter
(Invitrogen, Inc.). A fragment containing the IPP isomerase and trc promoter was
subsequently excised with EcoRV and KpnI, treated with the Klenow fragment of DNA
polymerase to produce blunt ends, and ligated in the Klenow-treated HindIII site of
pAC-BETA. E. coli cells transformed with this new plasmid, pAC-BETA-04, form orange colonies
on LB plates (vs. yellow for those containing pAC-BETA) and cultures accumulate
substantially more β-carotene (ca. two fold) than those that contain pAC-BETA.

Screening of an Arabidopsis cDNA Library 

Several λ cDNA expression libraries of Arabidopsis were obtained from the
Arabidopsis Biological Resource Center (Ohio State University, Columbus, OH) (Kieber et
al., 1993). The λ cDNA libraries were excised in vivo using Stratagene's ExAssist/SOLR
system to produce a phagemid cDNA library wherein each phagemid also contained a gene
conferring resistance to the antibiotic ampicillin.

E. coli strain DH10BZIP was chosen as the host cell for the screening and pigment
production, although we have also used TOP10 F' and XL1-Blue for this purpose. DH10B
cells were transformed with plasmid pAC-BETA-04 and were plated on LB agar plates
containing chloramphenicol at 50 µg/ml (from United States Biochemical Corporation). The
phagemid Arabidopsis cDNA library was then introduced into DH10B cells already
containing pAC-BETA-04. Transformed cells containing both pAC-BETA-04 and
Arabidopsis cDNA library phagemids were selected on chloramphenicol plus ampicillin (150
µg/ml) agar plates. Maximum color development occurred after 3 to 7 days incubation at
room temperature, and the rare bright yellow colonies were selected from a background of
many thousands of orange colonies on each agar plate. Selected colonies were inoculated
into 3 ml liquid LB medium containing ampicillin and chloramphenicol, and cultures were
incubated at room temperature for 1-2 days, with shaking. Cells were then harvested by
centrifugation and extracted with acetone in microfuge tubes. After centrifugation, the
pigmented extract was spotted onto silica gel thin-layer chromatography (TLC) plates and
developed with a hexane:ether (1:1, by volume) mobile phase. β-carotene
hydroxylase-encoding cDNAs were identified based on the appearance of a yellow pigment that
co-migrated with zeaxanthin on the TLC plates.

Subcloning and Sequencing

The plasmid containing the β-carotene hydroxylase cDNA was recovered and
analyzed by standard procedures (Sambrook et al., 1989). The Arabidopsis β-carotene
hydroxylase was sequenced completely on both strands on an automatic sequencer (Applied
Biosystems, Model 373A, Version 2.0.1S). The cDNA insert of 0.95 kb also was excised
and ligated into a pTrcHis vector. A BglII restriction site within the cDNA was used to
remove that portion of the cDNA that encodes the predicted polypeptide N terminal sequence
region that is not also found in bacterial β-carotene hydroxylases (Figure 6). A BglII-XhoI
fragment was directionally cloned in BamHI-XhoI digested pTrcHis vectors.

Pigment Analysis 

A single colony was used to inoculate 50 ml of LB containing ampicillin and 
chloramphenicol in a 250-ml flask. Cultures were incubated at 28°C for 36 hours with gentle 
shaking, and then harvested at 5000 rpm in an SS-34 rotor. The cells were washed once with 
distilled H2O and resuspended with 0.5 ml of water. The extraction procedures and HPLC
were essentially as described previously (Cunningham et al, 1994). 

II. Isolation and biochemical analysis of an Arabidopsis lycopene ε-cyclase
Plasmid Construction 

Construction of plasmids pAC-LYC, pAC-NEUR, and pAC-ZETA is described in 
Cunningham et al., (1994). In brief, the appropriate carotenoid biosynthetic genes from 
Erwinia herbicola, Rhodobacter capsulatus, and Synechococcus sp. strain PCC7942 were 
cloned in the plasmid vector pACYC184 (New England BioLabs, Beverly, MA). Cultures of
E. coli containing the plasmids pAC-ZETA, pAC-NEUR, and pAC-LYC accumulate
ζ-carotene, neurosporene, and lycopene, respectively. The plasmid pAC-BETA was
constructed as follows: an 8.6-kb BglII fragment containing the carotenoid biosynthetic genes
of E. herbicola (GenBank M87280; Hundle et al., 1991) was obtained after partial digestion
of plasmid pPL376 (Perry et al., 1986; Tuveson et al., 1986) and cloned in the BamHI site of
pACYC184 to give the plasmid pAC-EHER. Deletion of adjacent 0.8- and 1.1-kb
BamHI-BamHI fragments (deletion Z in Cunningham et al., 1994), and of a 1.1-kb SalI-SalI fragment
(deletion X) served to remove most of the coding regions for the E. herbicola β-carotene
hydroxylase (crtZ gene) and zeaxanthin glucosyltransferase (crtX gene), respectively. The
resulting plasmid, pAC-BETA, retains functional genes for geranylgeranyl pyrophosphate
synthase (crtE), phytoene synthase (crtB), phytoene desaturase (crtI), and lycopene cyclase
(crtY). Cells of E. coli containing this plasmid form yellow colonies and accumulate
β-carotene. A plasmid containing both the lycopene ε- and β-cyclase cDNAs of A. thaliana
was constructed by excising the ε-cyclase in clone y2 as a PvuI-PvuII fragment and ligating
this piece in the SnaBI site of a plasmid (pSPORT 1 from GIBCO-BRL) that already
contained the β-cyclase (Cunningham et al., 1996).

Organisms and Growth Conditions 

E. coli strains TOP10 and TOP10 F' (obtained from Invitrogen Corporation, San
Diego, CA) and XL1-Blue (Stratagene) were grown in Luria-Bertani (LB) medium
(Sambrook et al., 1989) at 37°C in darkness on a platform shaker at 225 cycles per min.
Media components were from Difco (yeast extract and tryptone) or Sigma (NaCl).
Ampicillin at 150 µg/mL and/or chloramphenicol at 50 µg/mL (both from United States
Biochemical Corporation) were used, as appropriate, for selection and maintenance of
plasmids.

Mass Excision and Color Complementation Screening of an A. thaliana cDNA Library
A size-fractionated 1-2 kb cDNA library of A. thaliana in lambda ZAPII (Kieber et
al., 1993) was obtained from the Arabidopsis Biological Resource Center at The Ohio State
University (stock number CD4-14). Other size fractionated libraries were also obtained
(stock numbers CD4-13, CD4-15, and CD4-16). An aliquot of each library was treated to
cause a mass excision of the cDNAs and thereby produce a phagemid library according to the
instructions provided by the supplier of the cloning vector (Stratagene; E. coli strain
XL1-Blue and the helper phage R408 were used). The titre of the excised phagemid was
determined and the library was introduced into a lycopene-accumulating strain of E. coli
TOP10 F' (this strain contained the plasmid pAC-LYC) by incubation of the phagemid with
the E. coli cells for 15 min at 37°C. Cells had been grown overnight at 30°C in LB medium
supplemented with 2% (w/v) maltose and 10 mM MgSO4 (final concentration), and harvested
in 1.5 ml microfuge tubes at a setting of 3 on an Eppendorf microfuge (5415C) for 10 min.
The pellets were resuspended in 10 mM MgSO4 to a volume equal to one-half that of the
initial culture volume. Transformants were spread on large (150 mm diameter) LB agar petri
plates containing antibiotics to provide for selection of cDNA clones (ampicillin) and
maintenance of pAC-LYC (chloramphenicol). Approximately 10,000 colony forming units
were spread on each plate. Petri plates were incubated at 37°C for 16 hr and then at room
temperature for 2 to 7 days to allow maximum color development. Plates were screened
visually with the aid of an illuminated 3x magnifier and a low power stage-dissecting
microscope for the rare, pale pinkish-yellow to deep-yellow colonies that could be observed
in the background of pink colonies. A colony color of yellow or pinkish-yellow was taken as
presumptive evidence of a cyclization activity. These yellow colonies were collected with
sterile toothpicks and used to inoculate 3 ml of LB medium in culture tubes with overnight
growth at 37°C and shaking at 225 cycles/min. Cultures were split into two aliquots in
microfuge tubes and harvested by centrifugation at a setting of 5 in an Eppendorf 5415C
microfuge. After discarding the liquid, one pellet was frozen for later purification of plasmid
DNA. To the second pellet was added 1.5 ml EtOH, and the pellet was resuspended by
vortex mixing, and extraction was allowed to proceed in the dark for 15-30 min with
occasional remixing. Insoluble materials were pelleted by centrifugation at maximum speed
for 10 min in a microfuge. Absorption spectra of the supernatant fluids were recorded from
350-550 nm with a Perkin Elmer lambda six spectrophotometer.

Analysis of isolated clones 

Eight of the yellow colonies contained β-carotene, indicating that a single gene
product catalyzes both cyclizations required to form the two β endgroups of the symmetrical
β-carotene from the symmetrical precursor lycopene. One of the yellow colonies contained a
pigment with the spectrum characteristic of δ-carotene, a monocyclic carotenoid with a single
ε endgroup. Unlike the β-cyclase, this ε-cyclase appears unable to carry out a second
cyclization at the other end of the molecule.

The observation that ε-cyclase is unable to form two cyclic ε-endgroups (e.g., the
bicyclic ε-carotene) illuminates the mechanism by which plants can coordinate and control
the flow of substrate into carotenoids derived from β-carotene versus those derived from
α-carotene and also can prevent the formation of carotenoids with two ε endgroups.

The availability of the A. thaliana gene encoding the ε-cyclase enables the directed
manipulation of plant and algal species for modification of carotenoid content and
composition. Through inactivation of the ε-cyclase, whether at the gene level by deletion of
the gene or by insertional inactivation or by reduction of the amount of enzyme formed (such
as by antisense technology), one may increase the formation of β-carotene and other
pigments derived from it. Since vitamin A is derived only from carotenoids with β
endgroups, an enhancement of the production of β-carotene versus α-carotene may enhance
nutritional value of crop plants. Reduction of carotenoids with ε-endgroups may also be of
value in modifying the color properties of crop plants and specific tissues of these plants.
Alternatively, where production of α-carotene, or pigments such as lutein that are derived
from α-carotene, is desirable, whether for the color properties, nutritional value or other
reason, one may overexpress the ε-cyclase or express it in specific tissues. Wherever
agronomic value of a crop is related to pigmentation provided by carotenoid pigments the
directed manipulation of expression of the ε-cyclase gene and/or production of the enzyme
may be of commercial value.

The predicted amino acid sequence of the A. thaliana ε-cyclase enzyme was
determined. A comparison of the amino acid sequences of the β- and ε-cyclase enzymes of
Arabidopsis thaliana (Fig. 13) as predicted by the DNA sequence of the respective cDNAs
(Fig. 4 for the ε-cyclase cDNA sequence), indicates that these two enzymes have many
regions of sequence similarity, but they are only about 37% identical overall at the amino acid
level. The degree of sequence identity at the DNA base level, only about 50%, is sufficiently
low such that we and others have been unable to detect this gene by hybridization using the
β-cyclase as a probe in DNA gel blot experiments.